Self-organized Hierarchical Softmax
Authors
Abstract
We propose a new self-organizing hierarchical softmax formulation for neural-network-based language models over large vocabularies. Instead of using a predefined hierarchical structure, our approach is capable of learning word clusters with clear syntactic and semantic meaning during the language model training process. We provide experiments on standard benchmarks for language modeling and sentence compression tasks. We find that this approach is as fast as other efficient softmax approximations, while achieving comparable or even better performance relative to similar full softmax models.
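To make the efficiency argument concrete, the sketch below shows the general two-level (class-based) hierarchical softmax idea this line of work builds on: each word is assigned to a cluster, and p(w | h) factors into a cluster term and a within-cluster term, so a prediction touches O(C + |cluster|) outputs instead of O(V). This is a minimal sketch, not the authors' formulation: the fixed random word-to-cluster map, all sizes, and all names here are illustrative assumptions, whereas in the paper the cluster structure is itself learned ("self-organized") during training.

import numpy as np

# Illustrative two-level hierarchical softmax. The random word-to-cluster
# map and all sizes/names are assumptions for this sketch only; the paper
# learns the cluster assignment during training.
V, C, d = 10000, 100, 128                        # vocabulary, clusters, hidden size
rng = np.random.default_rng(0)
word2cluster = rng.integers(0, C, size=V)        # hypothetical fixed assignment
W_cluster = rng.standard_normal((C, d)) * 0.01   # cluster-level softmax weights
W_word = rng.standard_normal((V, d)) * 0.01      # word-level softmax weights

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def log_prob(h, w):
    # log p(w | h) = log p(c | h) + log p(w | c, h), with c the cluster of w
    c = word2cluster[w]
    p_cluster = softmax(W_cluster @ h)            # O(C) scores
    members = np.flatnonzero(word2cluster == c)   # words in w's cluster
    # (a real implementation would precompute the member list per cluster)
    p_within = softmax(W_word[members] @ h)       # O(|cluster|) scores
    return np.log(p_cluster[c]) + np.log(p_within[members == w][0])

h = rng.standard_normal(d)                        # a hidden state from the LM
print(log_prob(h, w=42))

With C on the order of the square root of V and roughly balanced clusters, each prediction costs O(sqrt(V)) rather than O(V), which is where the speedup over a full softmax comes from; the paper's contribution is making the cluster structure emerge during training instead of fixing it in advance.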
Similar resources
Strategies for Training Large Vocabulary Neural Language Models
Training neural network language models over large vocabularies is computationally costly compared to count-based models such as Kneser-Ney. We present a systematic comparison of neural strategies to represent and train large vocabularies, including softmax, hierarchical softmax, target sampling, noise contrastive estimation and self normalization. We extend self normalization to be a proper es...
Self-organized annealing in laterally inhibited neural networks shows power law decay
In this paper we present a method which assigns to each layer of a multilayer neural network, whose network dynamics is governed by a noisy winner-take-all mechanism, an approximated temperature β. This approximated temperature is obtained by comparison of a softmax mechanism where a temperature is well defined with the noisy winner-take-all mechanism. We apply this method to a multilayer neura...
Hierarchical Memory Networks
Memory networks are neural networks with an explicit memory component that can be both read and written to by the network. The memory is often addressed in a soft way using a softmax function, making end-to-end training with backpropagation possible. However, this is not computationally scalable for applications which require the network to read from extremely large memories. On the other hand,...
Exploration of Tree-based Hierarchical Softmax for Recurrent Language Models
Recently, variants of neural networks for computational linguistics have been proposed and successfully applied to neural language modeling and neural machine translation. These neural models can leverage knowledge from massive corpora but they are extremely slow as they predict candidate words from a large vocabulary during training and inference. As an alternative to gradient approximation an...
Leaf-Smoothed Hierarchical Softmax for Ordinal Prediction
We propose a new approach to conditional probability estimation for ordinal labels. First, we present a specialized hierarchical softmax variant inspired by k-d trees that leverages the inherent spatial structure of (potentially-multivariate) ordinal labels. We then adapt ideas from signal processing on noisy graphs to develop a novel regularizer for such hierarchical softmax models. Both our t...
Journal: CoRR
Volume: abs/1707.08588
Pages: -
Publication date: 2017